Annotation-based Distance Measures for Patient Subgroup Discovery in Clinical Microarray Studies

نویسندگان

  • Claudio Lottaz
  • Joern Toedling
  • Rainer Spang
چکیده

MOTIVATION Clustering algorithms are widely used in the analysis of microarray data. In clinical studies, they are often applied to find groups of co-regulated genes. Clustering, however, can also stratify patients by similarity of their gene expression profiles, thereby defining novel disease entities based on molecular characteristics. Several distance-based cluster algorithms have been suggested, but little attention has been given to the distance measure between patients. Even with the Euclidean metric, including and excluding genes from the analysis leads to different distances between the same objects, and consequently different clustering results. RESULTS We describe a new clustering algorithm, in which gene selection is used to derive biologically meaningful clusterings of samples by combining expression profiles and functional annotation data. According to gene annotations, candidate gene sets with specific functional characterizations are generated. Each set defines a different distance measure between patients, leading to different clusterings. These clusterings are filtered using a resampling-based significance measure. Significant clusterings are reported together with the underlying gene sets and their functional definition. CONCLUSIONS Our method reports clusterings defined by biologically focused sets of genes. In annotation-driven clusterings, we have recovered clinically relevant patient subgroups through biologically plausible sets of genes as well as new subgroupings. We conjecture that our method has the potential to reveal so far unknown, clinically relevant classes of patients in an unsupervised manner. AVAILABILITY We provide the R package adSplit as part of Bioconductor release 1.9 and on http://compdiag.molgen.mpg.de/software.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Identification of Prognostic Genes in Her2-enriched Breast Cancer by Gene Co-Expression Net-work Analysis

Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...

متن کامل

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to produce a large amount of data and in the field of biology, microarray technology has also dramatically developed. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...

متن کامل

Extracellular exosomes and preeclampsia: a microarray-based study and functional enrichment analysis

Background:  Preeclampsia (PE) is a heterogeneous pregnancy disease which the exact pathophysiology of it is unknown. Recently exosomes have been indicated as a causative factor in the pathogenesis of PE. The aim of the study was to investigate in microarray library data to extract the differentially expressed genes (DEGs) in PE and to perform a functional enrichment analysis to predict the rol...

متن کامل

Association Rule Based Similarity Measures for the Clustering of Gene Expression Data

In life threatening diseases, such as cancer, where the effective diagnosis includes annotation, early detection, distinction, and prediction, data mining and statistical approaches offer the promise for precise, accurate, and functionally robust analysis of gene expression data. The computational extraction of derived patterns from microarray gene expression is a non-trivial task that involves...

متن کامل

O-17: Human Ovarian Tissue Xenotransplantation Application in Drug Discover

Background: Reproductive toxicity studies (RTS) represent an important part of pre-clinical safety evaluation of any drug development process for human consumption. It is mandatory requirement that RTS must be completed before any drug can be administered to women of childbearing age. Therefore, at first segment of toxicity check the toxic effect of the drug on fertility must be tested on non-c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Bioinformatics

دوره 23 17  شماره 

صفحات  -

تاریخ انتشار 2006